Distributed adaptive Huber regression
Abstract
Distributed data naturally arise in scenarios involving multiple sources of observations, each stored at a different location. Directly pooling all the data together is often prohibited due to limited bandwidth and storage, or due to privacy protocols. A new robust distributed algorithm is introduced for fitting linear regressions when the data are subject to heavy-tailed and/or asymmetric errors with finite second moments. The algorithm only communicates gradient information at each iteration and is therefore communication-efficient. To achieve the bias-robustness tradeoff, the key is a novel double-robustification approach that applies to both the local and global objective functions. Statistically, the resulting estimator achieves the centralized nonasymptotic error bound as if all the data were pooled together and came from a distribution with sub-Gaussian tails. Under a finite (2 + δ)-th moment condition, a Berry-Esseen bound for the distributed estimator is established, based on which robust confidence intervals are constructed. In high dimensions, the proposed doubly-robustified loss function is complemented with ℓ1-penalization for fitting sparse linear models with distributed data. Numerical studies further confirm that, compared with extant distributed methods, the proposed methods achieve near-optimal accuracy with low variability and better coverage with tighter confidence width.
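The communication pattern described in the abstract, where machines exchange only gradient vectors and never raw observations, can be illustrated with a minimal Python sketch. This is not the paper's double-robustified procedure: it is plain gradient descent on a pooled Huber loss, with a fixed robustification parameter `tau` rather than an adaptively chosen one. The function names, the step size, the simulated Student-t noise, and the value `tau=2.0` are all assumptions made for illustration.

```python
import numpy as np

def huber_gradient(X, y, beta, tau):
    """Gradient of the average Huber loss on one machine's local data."""
    residuals = y - X @ beta
    # Huber score function: identity on [-tau, tau], clipped outside.
    psi = np.clip(residuals, -tau, tau)
    return -X.T @ psi / len(y)

def distributed_huber(shards, tau, step=0.5, n_iter=200):
    """Gradient descent in which each round shares only local gradients."""
    p = shards[0][0].shape[1]
    beta = np.zeros(p)
    n_total = sum(len(y) for _, y in shards)
    for _ in range(n_iter):
        # Each machine sends one p-vector; the center forms the
        # sample-size-weighted average, which equals the pooled gradient.
        grad = sum(len(y) * huber_gradient(X, y, beta, tau)
                   for X, y in shards) / n_total
        beta = beta - step * grad
    return beta

def simulate(n_machines=5, n_local=400, seed=0):
    """Hypothetical data: heavy-tailed noise with finite variance."""
    rng = np.random.default_rng(seed)
    beta_true = np.array([1.0, -2.0, 0.5])
    shards = []
    for _ in range(n_machines):
        X = rng.normal(size=(n_local, 3))
        # Student-t with 2.5 degrees of freedom: finite second moment,
        # but much heavier tails than a Gaussian.
        noise = rng.standard_t(df=2.5, size=n_local)
        shards.append((X, X @ beta_true + noise))
    return shards, beta_true

shards, beta_true = simulate()
beta_hat = distributed_huber(shards, tau=2.0)
print("estimate:", np.round(beta_hat, 3))
```

Each round costs one vector of length p per machine, regardless of the local sample size. A faithful implementation would additionally robustify the local objective and tune `tau` to the sample sizes, as the abstract indicates.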
Related articles
Active Regression with Adaptive Huber Loss
This paper addresses the scalar regression problem presenting a solution for optimizing the Huber loss in a general semi-supervised setting, which combines multi-view learning and manifold regularization. To this aim, we propose a principled algorithm to 1) avoid computationally expensive iterative solutions while 2) adapting the Huber loss threshold in a data-driven fashion and 3) actively bal...
Sparse Quantile Huber Regression for Efficient and Robust Estimation
We consider new formulations and methods for sparse quantile regression in the high-dimensional setting. Quantile regression plays an important role in many applications, including outlier-robust exploratory analysis in gene selection. In addition, the sparsity consideration in quantile regression enables the exploration of the entire conditional distribution of the response variable given the ...
Distributed Multinomial Regression
This article introduces a model-based approach to distributed computing for multinomial logistic (softmax) regression. We treat counts for each response category as independent Poisson regressions via plug-in estimates for fixed effects shared across categories. The work is driven by the high-dimensional-response multinomial models that are used in analysis of a large number of random counts. O...
Outlier Detection in Regression Using an Iterated One-Step Approximation to the Huber-Skip Estimator
In regression we can delete outliers based upon a preliminary estimator and reestimate the parameters by least squares based upon the retained observations. We study the properties of an iteratively defined sequence of estimators based on this idea. We relate the sequence to the Huber-skip estimator. We provide a stochastic recursion equation for the estimation error in terms of a kernel, the p...
Journal
Journal title: Computational Statistics & Data Analysis
Year: 2022
ISSN: 0167-9473, 1872-7352
DOI: https://doi.org/10.1016/j.csda.2021.107419